41 research outputs found

Recommended from our members

A constraint based structure description language for Biosequences

Author: Eidhammer I
Gilbert D
Grindhaug SH
Jonassen J
Ratnayake R
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2001

Brunel University Research Archive

A simple and fast heuristic for protein structure comparison

Author: A Caprara
A May
A Murzin
A Zemla
B Thiruv
D Barthel
D Fischer
D Goldman
D Pelta
D Zhi
David A Pelta
DM Strickland
G Lancia
H Liisa
I Eidhammer
I Shindyalov
J Leluk
Juan R González
L Chew
L Holm
Marcos Moreno Vega
N Krasnogor
N Leibowitz
P Bourne
P Hansen
P Koehl
R Development Core Team
RA Laskowski
W Taylor
W Xie
Z Aung
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Protein structure comparison is a key problem in bioinformatics. Several methods exist for comparing proteins, one of which is solving the Maximum Contact Map Overlap problem (MAX-CMO). Although this problem can be solved with exact algorithms, researchers need approximate algorithms that obtain good-quality solutions using fewer computational resources than the exact ones. Results: We propose a variable neighborhood search metaheuristic for solving MAX-CMO. We analyze this strategy in two respects: 1) from an optimization point of view, the strategy is tested on two different datasets, obtaining an error of 3.5% (over 2702 pairs) and 1.7% (over 161 pairs) with respect to optimal values, thus yielding highly accurate solutions in a simpler and less expensive way than exact algorithms; 2) in terms of protein structure classification, we conduct experiments on three datasets and show that it is feasible to detect structural similarities at SCOP's family and CATH's architecture levels using normalized overlap values. Some limitations and the role of normalization are outlined for classification at SCOP's fold level. Conclusion: We designed, implemented and tested a new tool for solving MAX-CMO, based on a well-known metaheuristic technique. The good balance between solution quality and computational effort makes it a valuable tool. Moreover, to the best of our knowledge, this is the first time the MAX-CMO measure has been tested at SCOP's fold and CATH's architecture levels with encouraging results. Software is available for download at http://modo.ugr.es/jrgonzalez/msvns4maxcmo. This work is supported by Projects HeuriCosc TIN2005-08404-C04-01 and HeuriCode TIN2005-08404-C04-03, both from the Spanish Ministry of Education and Science. JRG acknowledges financial support from Project TIC2002-04242-C03-02. The authors thank N. Krasnogor and the ProCKSi project (BB/C511764/1) for their support.
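As an illustration of the quantity that MAX-CMO maximizes, the sketch below counts the contacts shared by two contact maps under a fixed residue alignment. It is not the variable neighborhood search described in the abstract; it only evaluates one candidate alignment. The function name `contact_map_overlap`, the representation of a contact map as a list of index pairs, and the example data are assumptions made for this sketch.

```python
def contact_map_overlap(contacts_a, contacts_b, alignment):
    """Count contacts of protein A that the alignment maps onto contacts of B.

    contacts_a, contacts_b: iterables of (i, j) residue-index pairs.
    alignment: dict mapping residue indices of A to residue indices of B;
               assumed here to be one-to-one and order-preserving.
    """
    # Store the contacts of B as unordered pairs for constant-time lookup.
    b_pairs = {frozenset(pair) for pair in contacts_b}
    overlap = 0
    for i, j in contacts_a:
        # A contact can only be shared if both of its endpoints are aligned.
        if i in alignment and j in alignment:
            if frozenset((alignment[i], alignment[j])) in b_pairs:
                overlap += 1
    return overlap


# Hypothetical toy data: three contacts per protein and one candidate alignment.
contacts_a = [(0, 2), (1, 3), (2, 4)]
contacts_b = [(0, 2), (2, 5), (3, 5)]
alignment = {0: 0, 1: 1, 2: 2, 3: 4, 4: 5}

print(contact_map_overlap(contacts_a, contacts_b, alignment))
```

A search procedure for MAX-CMO would call such an evaluation repeatedly while varying the alignment and keep the alignment with the largest overlap.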

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Repositorio Institucional Universidad de Granada

Using least median of squares for structural superposition of flexible proteins

Author: A Atkinson
A Godzik
AM Lesk
B Horn
D Fass
D Flower
D Theobald
E Buck
E Coutsias
H Berman
I Eidhammer
I Luque
I Shindyalov
J Moult
K Damm
K Sumathi
Karthik Ramani
L Holm
L Kavraki
M Gerstein
M Shatsky
MA Fischler
N Echols
O Carugo
P Bourne
P Rousseeuw
R Chiang
R Diamond
R Maiti
R Page
RH Lathrop
S Fleishman
S Kearsley
S Tilley
S Wallin
T Pawson
T Perkins
V Choi
V Hilser
V Maiorov
W Kabsch
W Krebs
W Wriggers
Y Ye
Y Zhang
Yi Fang
Yu-Shen Liu
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Conventional superposition methods use an ordinary least squares (LS) fit for structural comparison of two different conformations of the same protein. The main problem with the LS fit is that it is sensitive to outliers, i.e. large displacements of the original structures superimposed. Results: To overcome this problem, we present a new algorithm to overlap two protein conformations by their atomic coordinates using a robust statistics technique: least median of squares (LMS). To approximate the LMS optimization effectively, the forward search technique is used. Our algorithm can automatically detect and superimpose the rigid core regions of two conformations with small or large displacements. In contrast, most existing superposition techniques depend strongly on the initial LS estimate for the entire atom sets of the proteins, and they may fail on structural superposition of two conformations with large displacements. The presented LMS fit can be considered an alternative and complementary tool for structural superposition. Conclusion: The proposed algorithm is robust and does not require any prior knowledge of the flexible regions. Furthermore, we show that the LMS fit can be extended to multiple-level superposition between two conformations with several rigid domains. Our fit tool has produced successful superpositions when applied to proteins for which two conformations are known. The binary executable program for the Windows platform, tested examples, and database are available from https://engineering.purdue.edu/PRECISE/LMSfit.
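To make the contrast between the two fitting criteria concrete, the sketch below evaluates an LS-style objective (mean of squared atom-pair distances) and an LMS-style objective (median of squared atom-pair distances) for two coordinate sets that are assumed to be already superimposed. It does not implement the forward search or the superposition itself; the function names and the toy coordinates are assumptions made for this sketch.

```python
from statistics import mean, median


def squared_distances(coords_a, coords_b):
    """Squared Euclidean distance for each corresponding atom pair."""
    return [
        sum((p - q) ** 2 for p, q in zip(a, b))
        for a, b in zip(coords_a, coords_b)
    ]


def ls_objective(coords_a, coords_b):
    """Least-squares style criterion: every atom pair contributes."""
    return mean(squared_distances(coords_a, coords_b))


def lms_objective(coords_a, coords_b):
    """Least-median-of-squares style criterion: only the median residual counts."""
    return median(squared_distances(coords_a, coords_b))


# Hypothetical toy conformations: four atoms coincide, one atom is displaced
# by 10 units, standing in for a flexible region.
conf_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0),
          (3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
conf_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0),
          (3.0, 0.0, 0.0), (4.0, 0.0, 10.0)]

print(ls_objective(conf_a, conf_b), lms_objective(conf_a, conf_b))
```

In this toy case the single displaced atom dominates the mean while leaving the median untouched, which is the sensitivity to outliers that the abstract attributes to the LS fit.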

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Purdue E-Pubs

Linear-time protein 3-D structure searching with insertions and deletions

Author: ACR Martin
AI Jewett
B Zhu
C Gergely
CH Chionh
D Bu
D Goldman
DG Corneil
DW Eggert
E Krissinel
F Zu-Kang
G Navarro
GH Golub
H Hasegawa
HA Kramers
HM Berman
I Eidhammer
IN Shindyalov
Jesper Jansson
JT Schwartz
KS Arun
Kunihiko Sadakane
L Holm
M Comin
M Shatsky
P Koehl
PG de Gennes
PJ Flory
RH Boyd
RH Lathrop
T Shibuya
Tetsuo Shibuya
W Kabsch
WR Taylor
Z Aung
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Two biomolecular 3-D structures are said to be similar if the RMSD (root mean square deviation) between the two molecules' sequences of 3-D coordinates is less than or equal to some given constant bound. Tools for searching for similar structures in biomolecular 3-D structure databases are becoming increasingly important in the structural biology of the post-genomic era. Results: We consider an important, fundamental problem of reporting all substructures in a 3-D structure database of chain molecules (such as proteins) which are similar to a given query 3-D structure, with consideration of indels (i.e., insertions and deletions). This problem has been believed to be very difficult, but its exact computational complexity has not been known. In this paper, we first prove that the problem in unbounded dimensions is NP-hard. We then propose a new algorithm that dramatically improves the average-case time complexity of the problem in 3-D when the number of indels k is bounded by a constant. Our algorithm solves the above problem for a query of size m and a database of size N in average-case O(N) time, whereas the time complexity of the previously best algorithm was O(Nm^{k+1}). Conclusions: Our results show that although the problem of searching for similar structures in a database based on the RMSD measure with indels is NP-hard in the case of unbounded dimensions, it can be solved in 3-D by a simple average-case linear-time algorithm when the number of indels is bounded by a constant.
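The similarity criterion above can be sketched as follows. This minimal example computes the RMSD between two equal-length coordinate sequences and compares it with a bound. It assumes the two structures are already optimally superimposed, so it omits the minimization over rotations and translations that RMSD-based comparison normally involves, and it does not handle indels. The function names `rmsd` and `is_similar` and the toy coordinates are assumptions made for this sketch.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between corresponding 3-D points.

    Assumes both sequences have the same length and are already superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sequences must be non-empty and equal in length")
    total = sum(
        sum((p - q) ** 2 for p, q in zip(a, b))
        for a, b in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def is_similar(coords_a, coords_b, bound):
    """Similarity test: RMSD less than or equal to the given constant bound."""
    return rmsd(coords_a, coords_b) <= bound


# Hypothetical toy structures: every point of the second is shifted by 1 along z.
query = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
candidate = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]

print(rmsd(query, candidate), is_similar(query, candidate, 1.0))
```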

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

An efficient algorithm for protein structure comparison using elastic shape analysis

Author: A Srivastava
A. Rai
AG Murzin
AL Morris
AS Konagurthu
BJ Grant
CA Orengo
D. C. Mishra
DG Kendall
E Klassen
F Domingues
G Mayr
GF Schenk
H Hasegawa
HM Berman
I Eidhammer
I Friedberg
I Wohlers
IN Shindyalov
J Kyte
J Laborde
JD Thompson
JM Sauder
JM Zimmerman
K. K. Chaturvedi
L Holm
L Lo Conte
M Levitt
M Menke
M Novotny
MF Perutz
R Grantham
R Kolodny
S Salem
S. B. Lal
S. N. Rai
S. Srivastava
SC Li
U. B. Angadi
W Liu
W Mio
Y Ye
Publication venue: Springer Science and Business Media LLC

Crossref

A new method for identification of protein (sub)families in a set of proteins based on hydropathy distribution in proteins

Author: Aasland R.
Eidhammer I.
Panek J.
Publication venue: Wiley
Publication date: 01/01/2005

Structural similarity among proteins is reflected in the distribution of hydropathicity along the amino acids in the protein sequence. Similarities in the hydropathy distributions are obvious for homologous proteins within a protein family. They were also observed for proteins with related structures, even when sequence similarities were undetectable. Here we present a novel method that employs the hydropathy distribution in proteins for identification of (sub)families in a set of (homologous) proteins. We represent proteins as points in a generalized hydropathy space, represented by vectors of specifically defined features. The features are derived from the hydropathy of the individual amino acids. Projection of this space onto principal axes reveals groups of proteins with related hydropathy distributions. The groups identified correspond well to families of structurally and functionally related proteins. We found that this method accurately identifies protein families in a set of proteins, or subfamilies in a set of homologous proteins. Our results show that protein families can be identified by the analysis of hydropathy distribution, without the need for sequence alignment. © 2005 Wiley-Liss, Inc.

University of Queensland eSpace

Approaches to the automatic discovery of patterns in biosequences

Author: Brazma A
Eidhammer I
Gilbert D
Jonassen I
Publication venue: Mary Ann Liebert Inc
Publication date: 01/01/1998

This paper surveys approaches to the discovery of patterns in biosequences and places these approaches within a formal framework that systematises the types of patterns and the discovery algorithms. Patterns with expressive power in the class of regular languages are considered, and a classification of pattern languages in this class is developed, covering the patterns that are the most frequently used in molecular bioinformatics. A formulation is given of the problem of the automatic discovery of such patterns from a set of sequences, and an analysis is presented of the ways in which an assessment can be made of the significance of the discovered patterns. It is shown that the problem is related to problems studied in the field of machine learning. The major part of this paper comprises a review of a number of existing methods developed to solve the problem and how these relate to each other, focusing on the algorithms underlying the approaches. A comparison is given of the algorithms, and examples are given of patterns that have been discovered using the different methods

Brunel University Research Archive

Samira-VP: A simple protein alignment method with rechecking the alphabet vector positions

Author: Eidhammer I
Gibrat J
Mohd Saberi Mohamad
Razmara J
Safaai Deris
Samira Fotoohifiroozabadi
Publication venue: World Scientific Pub Co Pte Lt

Crossref

Geometric Suffix Tree: A New Index Structure for Protein 3-D Structures

Author: D. Gusfield
D.W. Eggert
E. Ukkonen
E.M. McCreight
H.M. Berman
I. Choi
I. Eidhammer
J.T. Schwartz
K. Kedem
K.S. Arun
L.P. Chew
T. Shibuya
Publication venue: Springer
Publication date: 01/01/2006

Protein structure analysis is one of the most important research issues in the post-genomic era, and faster and more accurate query data structures for such 3-D structures are highly desired for research on proteins. This paper proposes a new data structure for indexing protein 3-D structures. For strings, there are many efficient indexing structures such as suffix trees, but it has been considered very difficult to design such sophisticated data structures for 3-D structures like proteins. Our index structure is based on suffix trees and is called the geometric suffix tree. By using the geometric suffix tree for a set of protein structures, we can search for all of their substructures whose RMSDs (root mean square deviations) or URMSDs (unit-vector root mean square deviations) to a given query 3-D structure are not larger than a given bound. Though there are O(N^2) substructures, our data structure requires only O(N) space, where N is the sum of the lengths of the set of proteins. We propose an O(N^2) construction algorithm for it, while a naive algorithm would require O(N^3) time to construct it. Moreover, we propose an efficient search algorithm. We also show computational experiments to demonstrate the practicality of our data structure. The experiments show that the construction time of the geometric suffix tree is, in practice, almost linear in the size of the database when applied to a protein structure database.

CiteSeerX

Crossref

Le syndrome hépato-pulmonaire

Author: A. Murzin
A. Zemla
C.A. Orengo
G. Mayr
I. Eidhammer
I.N. Shindyalov
J. Zhu
J.T. Chwartz
L. Holm
M. Gerstein
M. Milik
M. Shatsky
M.R. Garey
N.N. Alexandrov
R. Kolodny
S. Subbiah
S.B. Needleman
S.C. Flores
U. Emekli
W. Kabsch
W. Wriggers
X. Yuan
Y. Lindqvist
Y. Ye
Publication date: 01/01/1998

Crossref

Open Repository and Bibliography - Liège